Regret Testing: A Simple Payoff-Based Procedure for Learning Nash Equilibrium
Abstract
A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.
Keywords: learning, Nash equilibrium, regret, bounded rationality
JEL Classification Numbers: C72, D83.
Similar resources
Regret Testing: A Simple Payoff-Based Procedure for Learning Nash Equilibrium
A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if the player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a simple class of radically uncoupled learning rules, patterned after aspiration learning models, whose period-by-period behavior comes arbitrarily close to Nash equilibrium...
Global Nash convergence of Foster and Young's regret testing
We construct an uncoupled randomized strategy of repeated play such that, if every player plays according to it, mixed action profiles converge almost surely to a Nash equilibrium of the stage game. The strategy requires very little in terms of information about the game, as players’ actions are based only on their own past payoffs. Moreover, in a variant of the procedure, players need not know...
Regret testing: learning to play Nash equilibrium without knowing you have an opponent
A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.
Deep Learning Games
We investigate a reduction of supervised learning to game playing that reveals new connections and learning methods. For convex one-layer problems, we demonstrate an equivalence between global minimizers of the training problem and Nash equilibria in a simple game. We then show how the game can be extended to general acyclic neural networks with differentiable convex gates, establis...
On the Value of Randomizing and Limiting Memory in Repeated Decision-Making under Minimal Regret
We search for behavioral rules that attain minimax regret under geometric discounting in the context of repeated decision making in a stationary environment where payoffs belong to a given bounded interval. Rules that attain minimax regret exist and are optimal for Bayesian decision making under the prior where learning can be argued to be most difficult. Minimax regret can be attained by randomiz...
Publication date: 2006